  value abs_value
1    -2         2
2    -1         1
3     0         0
4     1         1
5     2         2
September 23 + 25, 2024
The goal of simulating a complicated model is not only to create a program which will provide the desired results. We also hope to be able to write code such that:
ifelse()
case_when()
library(tidyverse)  # provides the diamonds data and the dplyr verbs used below
set.seed(4747)
diamonds %>% select(carat, cut, color, price) %>%
  sample_n(20) %>%
  mutate(price_cat = case_when(
    price > 10000 ~ "expensive",
    price > 1500 ~ "medium",
    TRUE ~ "inexpensive"))
# A tibble: 20 × 5
   carat cut       color price price_cat
   <dbl> <ord>     <ord> <int> <chr>
 1  1.23 Very Good F     10276 expensive
 2  0.35 Premium   H       706 inexpensive
 3  0.7  Good      E      2782 medium
 4  0.4  Ideal     D      1637 medium
 5  0.53 Ideal     G      1255 inexpensive
 6  2.22 Ideal     G     14637 expensive
 7  0.3  Ideal     G       878 inexpensive
 8  1.05 Ideal     H      4223 medium
 9  0.53 Premium   E      1654 medium
10  1.7  Ideal     H      7068 medium
11  0.31 Good      E       698 inexpensive
12  0.31 Ideal     F       840 inexpensive
13  1.03 Ideal     H      4900 medium
14  0.31 Premium   G       698 inexpensive
15  1.56 Premium   G      8858 medium
16  1.71 Premium   G     11032 expensive
17  1    Good      E      5345 medium
18  1.86 Ideal     J     10312 expensive
19  1.08 Very Good E      3726 medium
20  0.31 Premium   E       698 inexpensive
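For comparison, the same three-way split can be sketched with nested ifelse() calls in base R. This is only an illustration: the price vector holds the first five prices from the output above, and the helper name price_category() is hypothetical.

```r
# First five prices from the tibble printed above
price <- c(10276, 706, 2782, 1637, 1255)

# Hypothetical helper: nested ifelse() calls checked in the same order
# as the case_when() conditions (first match wins)
price_category <- function(price) {
  ifelse(price > 10000, "expensive",
         ifelse(price > 1500, "medium", "inexpensive"))
}

price_cat <- price_category(price)
print(data.frame(price, price_cat))
```

Because the conditions are checked in order, a price above 10000 never reaches the "medium" test. With more than two or three categories, the nested form becomes hard to read, which is where case_when() tends to be the clearer choice.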
Sampling, shuffling, and resampling: sample()
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
[1] "i" "b" "g" "d" "a"
[1] "j" "g" "f" "i" "f"
[1] "f" "h" "i" "e" "g" "d" "c" "j" "b" "a"
[1] "e" "j" "e" "b" "e" "c" "f" "a" "e" "a"
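The calls that produced the output above are not shown. Calls of the following form would give output of the same shape; the name alph and the seed are assumptions, so the letters drawn here need not match those printed above.

```r
set.seed(4747)  # assumed seed, used only so this sketch is repeatable

alph <- letters[1:10]  # hypothetical name for the population of ten letters
print(alph)

s1 <- sample(alph, 5, replace = FALSE)  # sampling: 5 letters, no repeats possible
s2 <- sample(alph, 5, replace = TRUE)   # sampling: 5 letters, repeats possible
s3 <- sample(alph)                      # shuffling: a permutation of all ten letters
s4 <- sample(alph, replace = TRUE)      # resampling: ten draws with replacement

print(s1)
print(s2)
print(s3)
print(s4)
```

With no size argument, sample() draws as many items as the vector holds, so sample(alph) is a shuffle and sample(alph, replace = TRUE) is a resample of the same length.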
Three simulation methods are used for different purposes:
Monte Carlo methods - use repeated sampling from a population with known characteristics.
Randomization / Permutation methods - use shuffling (sampling without replacement from a sample) to test hypotheses of “no effect”.
Bootstrap methods - use resampling (sampling with replacement from a sample) to establish confidence intervals.
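The three methods above can be sketched in a few lines of base R. Everything here is an assumption made for illustration: the standard normal population, the made-up two-group data, the 1000 repetitions, the seed, and the helper name diff_means().

```r
set.seed(47)  # arbitrary seed so the sketch is repeatable
reps <- 1000  # assumed number of repetitions

# Monte Carlo: repeated samples from a population with known characteristics
# (assumed here to be standard normal), recording the sample mean each time
mc_means <- replicate(reps, mean(rnorm(25, mean = 0, sd = 1)))

# Made-up observed data with two groups of ten
group <- rep(c("A", "B"), each = 10)
y <- c(rnorm(10, mean = 0), rnorm(10, mean = 1))

# Hypothetical helper: difference in group means, B minus A
diff_means <- function(y, group) {
  mean(y[group == "B"]) - mean(y[group == "A"])
}
obs_diff <- diff_means(y, group)

# Randomization / permutation: shuffle the group labels (sampling without
# replacement) to see what differences arise under "no effect"
perm_diffs <- replicate(reps, diff_means(y, sample(group)))
p_value <- mean(abs(perm_diffs) >= abs(obs_diff))

# Bootstrap: resample the observed values with replacement and take
# percentiles of the resampled means as an interval for the mean
boot_means <- replicate(reps, mean(sample(y, replace = TRUE)))
ci <- quantile(boot_means, c(0.025, 0.975))

print(sd(mc_means))
print(p_value)
print(ci)
```

The only structural difference between the permutation and bootstrap steps is the replace argument to sample(): shuffling keeps every observed value exactly once, while resampling can repeat or omit values.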
set.seed()
What if we want to be able to generate the same random numbers (here on the interval from 0 to 1) over and over?
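A minimal sketch of the idea: resetting the seed before each call makes runif() return the same values. The seed value and the variable names first, second, and third are arbitrary choices for illustration.

```r
set.seed(4747)                        # fix the state of the random number generator
first <- runif(4, min = 0, max = 1)   # four uniform draws on the interval from 0 to 1

set.seed(4747)                        # reset to the same state
second <- runif(4, min = 0, max = 1)  # the same four draws again

third <- runif(4, min = 0, max = 1)   # no reset: the generator simply continues

print(first)
print(second)
print(identical(first, second))
print(third)
```

Setting the seed once at the top of a script is usually enough to make the whole analysis reproducible from run to run.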
Consider a situation where Sally and Joan plan to meet to study in their college campus center (Mosteller 1987; Baumer, Kaplan, and Horton 2021). They are both impatient people who will wait only 10 minutes for the other before leaving.
But their planning was incomplete. Sally said, “Meet me between 7 and 8 tonight at the student center.” When should Joan plan to arrive at the campus center? And what is the probability that they actually meet?
Assume that Sally and Joan are both equally likely to arrive at the campus center anywhere between 7pm and 8pm.
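One way to estimate the probability is by simulation. The sketch below is not necessarily the approach used in the cited sources: arrival times are measured in minutes after 7pm, and the number of simulated evenings, the seed, and the helper name meet_once() are assumptions. It computes the estimate two ways, once with vectors and once by repeating a single simulated evening.

```r
set.seed(4747)  # arbitrary seed so the sketch is repeatable
n <- 10000      # assumed number of simulated evenings

# Approach 1: vectorized. Draw all arrival times at once
# (minutes after 7pm, uniform between 0 and 60)
sally <- runif(n, min = 0, max = 60)
joan <- runif(n, min = 0, max = 60)
prop_vec <- mean(abs(sally - joan) <= 10)  # they meet if they arrive within 10 minutes

# Approach 2: simulate one evening at a time with a hypothetical helper
meet_once <- function() {
  abs(runif(1, min = 0, max = 60) - runif(1, min = 0, max = 60)) <= 10
}
prop_rep <- mean(replicate(n, meet_once()))

# Geometric check: they miss each other when the arrival times differ by more
# than 10 minutes, which covers (50/60)^2 of the unit square of arrival times
exact <- 1 - (50 / 60)^2  # equals 11/36

print(c(vectorized = prop_vec, replicate = prop_rep, exact = exact))
```

Both estimates should land near 11/36, which is about 0.31.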
The results themselves are equivalent; the values differ only because of randomness in the simulation.